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Distributed Wideband Spectrum Sensing 

Thomas Kealy^, Oliver Johnson^ and Robert Piechocki^ 


Abstract —We consider the problem of reconstructing wide¬ 
band frequency spectra from distributed, compressive measure¬ 
ments. The measurements are made by a network of nodes, 
each independently mixing the ambient spectra with low fre¬ 
quency, random signals. The reconstruction takes place via local 
transmissions between nodes, each performing simple statistical 
operations such as ridge regression and shrinkage. 

1. Introduction 

There is an almost ubiquitous growing demand for mobile 
and wireless data, with consumers demanding faster speeds 
and better quality connections in more places. Consequently 
4G is now being rolled out in the UK and US and with 5G 
being planned for 2020 and beyond iSl . 

However, there is constrained amount of frequencies over 
which to transmit this information; and demand for frequencies 
that provide sufficient bandwidth, good range and in-building 
penetration is high. 

Not all spectrum is used in all places and at all times, and 
judicious spectrum management, by developing approaches to 
use white spaces where they occur, would likely be beneficial. 

Broadly, access to spectrum is managed in two, comple¬ 
mentary ways, namely through licensed and licence exempt 
access. Licensing authorises a particular user (or users) to 
access a specific frequency band. Licence exemption allows 
any user to access a band provided they meet certain technical 
requirements intended to limit the impact of interference on 
other spectrum users. 

A licence exempt approach might be particularly suitable for 
managing access to white spaces. Devices seeking to access 
white spaces need a robust mechanism for learning of the 
frequencies that can be used at a particular time and location. 
One approach is to refer to a database, which maps the location 
of white spaces based on knowledge of existing spectrum 
users. An alternative approach is for devices to detect white 
spaces by monitoring spectrum use. 

The advantages of spectrum monitoring Q over persisting 
a database of space-frequency data are the ability of networks 
to make use of low-cost low-power devices, only capable 
of making local (as opposed to national) communications, 
keeping the cost of the network low and opportunistic channel 
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usage for bursty traffic, reducing channel collisions in dense 
networks. 

The realisation of any Cognitive Radio standard (such as 
IEEE 802.22 QO)), requires the co-existence of primary (TV 
users) and secondary (everybody else who wants to use TVWS 
spectrum) users of the frequency spectrum to ensure proper 
interference mitigation and appropriate network behaviour. We 
note, that whereas TVWS bands are an initial step towards 
dynamic spectrum access, the principles and approaches we 
describe are applicable to other frequency bands - in particular 
it makes ultra-wideband spectrum sensing possible. 

The challenges of this technology are that Cognitive Ra¬ 
dios (CRs) must sense whether spectrum is available, and 
must be able to detect very weak primary user signals. 
Furthermore they must sense over a wide bandwidth (due to 
the amount of TVWS spectrum proposed), which challenges 
traditional Nyquist sampling techniques, because the sampling 
rates required are not technically feasible with current RF or 
Analogue-to-Digital conversion technology. 

Due to the inherent sparsity of spectral utilisation. Com¬ 
pressive Sensing (CS) (H is an appropriate formalism within 
which to tackle this problem. CS has recently emerged as a 
new sampling paradigm allowing images to be taken from a 
single pixel camera for example. Applying this to wireless 
communication, we are able to reconstruct sparse signals at 
sampling rates below what would be required by Nyquist 
theory, for example the works 171, and ifT^ detail how this 
sampling can be achieved. 

However, even with CS, spectrum sensing from a single 
machine will be costly as the proposed TVWS band will 
be over a large frequency range (for instance in the UK the 
proposed TVWS band is from 470 MHz to 790 MHz, requiring 
traditional sampling rates of '-1600 MHz). CS at a single 
sensor would still require high sampling rates. In this paper we 
propose a distributed model, which allows a sensing budget at 
each node far below what is required by centralised CS. The 
main contribution of this paper is that the model can be solved 
in a fully distributed manner - we do not require a central 
fusion centre as in Ea. Moreover, we are able to show that 
the set of updates at each nodes takes closed form. 

The structure of the paper is as follows: in section |IJ we 
introduce the sensing model, in section [Iv] we describe the 
distributed reconstruction algorithm |[8|, and finally in section 
[V| we show some results of the reconstruction quality of this 
model. 

H. Model 

We consider a radio environment with a single primary user 
(PU) and a network of J nodes collaboratively trying to sense 


and reconstruct the PU signal, either in a fully distributed 
manner (by local communication), or by transmitting measure¬ 
ments to a fusion centre which then solves the linear system. 

We try to sense and reconstruct a wideband signal, divided 
into L channels. We have a (connected) network of J (= 50) 
nodes placed uniformly at random within the square [0,1] x 
[0,1]. This is the same model, as in |[T3]| . The calculations 
which follow are taken from Ga as well. 

The nodes individually take measurements (as in (71) by 
mixing the incoming analogue signal x{t) with a mixing 
function pi (t) aliasing the spectrum, x (t) is assumed to be 
bandlimited and composed of up to k uncorrelated transmis¬ 
sions over the L possible narrowband channels - i.e. the signal 
is /c-sparse. 

The mixing functions - which are independent for each node 
- are required to be periodic, with period Tp. Since pi is 
periodic it has Fourier expansion: 


Pii't)= Ciiexpljlt 


l = — OC 


27r 


(II-.l) 


The Cii are the Fourier coefficients of the expansion and 
are defined in the standard manner. The result of the mixing 
procedure in channel i is therefore the product xpi, with 
Fourier transform (we denote the Fourier Transform of x by 
X(;)): 


Xi (/) = X (t) Pi (t) dt 

J —oo 

oo 

^ CiiX{f-lfp) (II-.2) 

l= — oo 

(We insert the Fourier series for pi, then exchange the sum 
and integral). The output of this mixing process then, is a 
linear combination of shifted copies of X (/), with at most 
IfNyQ/fp] terms since X (/) is zero outside its support (we 
have assumed this Nyquist frequency exists, even though we 
never sample at that rate). 

This process is repeated in parallel at each node so that each 
band in x appears in baseband. 

Once the mixing process has been completed the signal in 
each channel is low-pass filtered and sampled at a rate fs > fp- 
In the frequency domain this is a ideal rectangle function, so 
the output of a single channel is: 

~\~Lq 

Yi ^ diX if - Ifp) (II-.3) 

l=-Lo 

since frequencies outside of [—/s/2, /s/2] will filtered out. 
I/O is the smallest integer number of non-zero contributions in 
X (/) over [-/s/2, /s/2] - at most [/Arl"Q//pl if we choose 
fs = fp- These relations can be written in matrix form as: 


coefficients, a partial Fourier Matrix, and a matrix of channel 
coefficients, x is the vector of unknown samples of x{t). 
i.e. A can be written: 


^mxL (gmXZ/jjiZ/XZ/j^Z/XZ/jjZ/XZ/ 

The measurements y are transmitted to a Fusion Centre via 
a control channel. The system II-.4 can then be solved (in the 
sense of finding the sparse vector x by convex optimisation 
via minimising the objective function: 


i||Ax-y||i +A||x||i 


(II-.6) 


where A is a parameter chosen to promote sparsity. Larger 
A means sparser x. 


III. ADMM 

The alternating direction method of multipliers [O, 
(ADMM), algorithm solves problems of the form 


argmin/(a;) + g (z) 

X 

s.t Ux^Vz = c (III-.7) 

where / and g are assumed to be convex function with range 
in M, f/ e and V g are matrices (not assumed 

to have full rank), and c eMP. 

ADMM consists of iteratively minimising the augmented 
Lagrangian 


Lp (x, z,ri) = f (x) +g (z) + (Ux + Vz - c) + 

(t] is a Lagrange multiplier), and p is a parameter we can 
choose to make g{z) smooth (91, with the following iterations: 


,/c+l 

;= argminl/p {x,z^,g^) 

(III-.8) 

./c+1 

;=argminLp {x'^+^, z,g'^) 

(III-.9) 

/+1 

:=g^ + p {Ux^+^ + - c) 

(III-. 10) 


The alternating minimisation works because of the decom- 
posability of the objective function: the x minimisation step 
is independent of the ^ minimisation step and vice versa. 

We illustrate an example, relevant to the type of problems 
encountered in signal processing. 

A. Example: ADMM for Centralised LASSO 

ADMM can be formulated as an iterative MAP estimation 
procedure for the problem (which is referred to as LASSO see 

CD): 


y = Ax + w (II-.4) 

where y contains the output of the measurement process, 
and A is a product matrix of the mixing functions, their Fourier 


^\\Ax - b\\l + X\\x\\i 
This can be cast in constrained form as: 


(III-A. 11) 



(III-A.12) 

(III-A.13) 


sign{x) = 0 


(III-A.22) 


l\\Ax-b\\l + X\\z\U 

sX z = X 


i.e this is of the form ( |III-.7| ) with f (x) = \\Ax — yW^, 
g (z) = All^lli, U = LV = -L and c = 0 . 

The associated Lagrangian is: 


Lp = ]^\\Ax-b\\l+X\\z\\i+ri{x - z)-\-f^\\x-zf (III-A.14) 

Now, given a set of noisy measurements (say of radio 
spectra) y, and a sensing matrix A we can use ADMM to 
find the (sparse) radio spectra. 

The ADMM iterations for LASSO, which can be found by 
alternately differentiating ( |III-A.14| ) with respect io x,z and g, 
are (in closed form): 


x'^+^ := [A^A + pi) ^ {A^b + p - y^)) (III-A.15) 

■■= Sx/p (III-A.16) 

^k+i .-yf (III-A.17) 

where Sx/p (o) is the soft thresholding operator: {x)- = 

sign(a;i) {\xi\ - 7 )'^. 

This algorithm has a nice statistical interpretation: it it¬ 
eratively performs ridge regression, followed by shrinkage 
towards zero. This is the MAP estimate for x under a Laplace 
prior. 

The soft-thresholding operator can be derived by consider¬ 
ing the MAP estimate of the following model: 

y = X w (III-A.18) 

where x is some (sparse) signal, and w is additive white 
Gaussian noise. We seek 

X = arg max ¥x\y{x\y) (III-A. 19) 

This can be recast in the following form by using Bayes rule, 
noting that the denominator is independent of x and taking 
logarithms: 


X = arg max [log {y — x) X- log ¥{x)] (III-A.20) 

X 

The term ¥n{y — x) arises because we are considering 
X w with w zero mean Gaussian, with variance So, 
the conditional distribution of y (given x) will be a Gaussian 
centred at x. 

We will take P(x) to be a Laplacian distribution: 


P(x) = 


V 2 , 


x/2i 


(III-A.21) 


a 


Note that / (x) = logPa;(x) — ^\x\, and so by differen¬ 
tiating f' (x) = — ^sign(x) 

Taking the maximum of |III-A.20 we obtain: 


y - X _ ^ 


Which leads the soft thresholding operation defined earlier, 

7/2 cr^ 

with 7 = ^ " as (via rearrangement): 


y = x + 


V2al 


sign {x) 


or 


X (y) = sign( 2 /) 


i.e S^{y). 




IV. Constrained Optimisation on Graphs 

We model the network as an undirected graph G = (V^E), 
where V = {1... J} is the set of vertices, and E = V xV is 
the set of edges. An edge between nodes i and j implies that 
the two nodes can communicate. The set of nodes that node 
i can communicate with is written A/i and the degree of node 
i is Ei — |A/^|. 

Individually nodes make the following measurements: 


Yp = ApX + Up (IV-.23) 

where Ap is the row of the sensing matrix from 
( |II-.4| ), and the system ( |II-.4| ) is formed by concatenating the 
individual nodes’ measurements together. 

We assume that a proper colouring of the graph is available: 
that is, each node is assigned a number from a set C = 
{1... c}, and no node shares a colour with any neighbour. 

To find the x we are seeking, to each node we give a copy 
of X, Xp and we constrain the copies to be indentical across all 
edges in the network. We can write the combined optimisation 
variable as x, which collects together C copies of a nx 1 vector 
x: 

Definition 1. We define vectors Xc, where c = 1,..., C and 
write the vector of length nJ: 
c 

X = ^ Wc<^ Xc = 

C=1 

where Wc(^i) = I(c(i) = c), I is the indicator function, and we 
have written c{i) for the colour of the ith node. 

The problem then is to solve: 




■ ^ ^ciJ) 


(IV-.24) 


A 

arg min EE W^jXj-yjWl + 7:||t||i 

^ c=i jec 

and Xi = Xj if {i,j} G E 
and Xi = Zi \/i e {1,..., C} (IV-.25) 

These constraints can be written more compactly by intro¬ 
ducing the node-arc incidence matrix B: sl V by E matrix 
where each column is associated with an edge {i,j) G E 





















and has 1 and —1 in the ith and jth entry respectively. 
Figures ( [IV. 1| ) and ( |IV.2j ) show examples of a network and 
it’s associated incidence matrix. 

Definition 2. 


U \= (g) In) X 

C 

= 0 In) y^Wc®Xc 

c=l 

= ^Bl ®Xc 

C=1 

where we have used the definition ( |IV-.24| ) in the second line, 
and the property of Krone eke r products {A 0 C){B 0 D) = 
(AB^CD) between the second and third lines, and we write 
Be = B. 

The constraint Xi = Xj if {i, j} G E can now be written 

c 

Y, ® In) xc = 0 (IV-.26) 

C=1 
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Figure IV.2. The incidence matrix associated with Figure ( |IV.ll 


= EE. - %ii 2 +/ 3 ihiiii) 

C=1 jEc 
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7] (B^ 0 In) Xq 2 \\Xc '^c|l2 
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^(Bj0/„)5, 


C=1 


(IV-.28) 


The term (\\Ylc=i {^c ^ ^n) Xc\\‘^) of ( |IV-.28| ), can be 
decomposed, using the following lemma: 


note that (5^(8)/^) G Together ( |IV-.24| ) and 

( |IV-.26| ), suggests that the problem ( |IV-.25j ) can be re-written 
as: 


c 

arg min 

^ c=i jeCc 

c 

S.t. ^ (g) In) Xc = 0 

c=l 

and Xc — Zc = 0 (IV-.27) 

where /3 = j. 



Figure IV. 1. An example of a network 


Lemma IV.l. 


C=1 


and 



DA\x,\\l - Y 

keNj 


xJ x^ 


(IV-.29) 


c 

9 ^ E '^In)xi=Y E 

c=i leCcmeNi 

(IV-.30) 

where p is decomposed edge-wise: p = (...,77^^,...), such 
that pij = pj^i, and is associated with the constraint Xi = Xj. 

Proof. 

U^U=EE (^<=1 ® ® Xc) 

Ci=l C2 = l 

= Y Bc,Bj^®x'^^Xc^ 

Cl,C2 

BB^ is Si J X J matrix, with the degree of the nodes on 
the main diagonal and —1 in position if nodes i and j 
are neighbours (i.e BB^ is the graph Laplacian). Hence, since 
we can write BB^Wc 2 ’ the trace of Bc^ B^^ is 

simply the sum of the degrees of nodes with colour 1. 

For Cl C 2 , Bc^BJ^ corresponds to an off diagonal block 
of the graph Laplacian, and so counts how many neighbours 
each node with colour 1 has. 

Finally, note that p G and can be written: 


The Augmented Lagrangian for the problem ( |IV-.27| ) can 
be written down as: 


c 

7] = Y'^c<^11c (IV-.31) 

C=1 





























where r]c is the vector of Lagrange multipliers associated 
across edges from colour c. Now 

c c 

ifu = L! X! (g) 

Ci=l C2 = l 

by the properties of Kronecker products, and the definition of 
Be. For Cl = C 2 , r]^u is zero, as there are no edges between 
nodes of the same colour b definition. For ci 7 ^ C 2 , r]^u counts 
the edges from ci to C 2 , with the consideration that the edges 
from C 2 to Cl are counted with opposite parity. □ 

Adding together this with the lemma, lets us write ( [IV-.28| ) 
as: 

c 

Lp = (^\\AjXj — yj\\2 + ^Ikilli) + ^ 

c=ijeCc 

+ ^Di\\x^\f + ^\\x^-z^\\^ (IV-.32) 

where we have defined: 


where is the result of the algorithm at iteration k, and 
Z* is the optimal solution. 

These results indicate that for both centralised and dis¬ 
tributed solvers, adding noise to the system results in a de¬ 
grading of performance. Interestingly note, that the distributed 
solver seems to (slightly) outperform the centralised solver at 
all SNRs. This is counter-intuitive, as it would be expected 
that centralised solvers knowing all the available information 
would outperform distributed solutions. We conjecture that the 
updates described in section take into account differences 
in noise across the network. The distributed averaging steps, 
which form the new prior for each node, then penalise updates 
from relatively more noisy observations. This corroborates 
observations from El. 

This observation is (partially) confirmed in figure (??), 
which plots the progress of the centralised and distributed 
solvers (as a function of iterations) towards the optimum 
solution. The SNR is 0.5 (i.e the signal is twice as strong as the 
noise). Note that after around 300 iterations, the MSB of the 
distributed solver is consistently below that of the centralised 
solver. 


= 



sign {k - i) 


- pxk 


(IV-.33) 


this is a rescaled version of the Lagrange multiplier, g, 
which respects the graph structure. 

Then by differentiating pV-.32| ) with respect to Xj and Zj 
we can find closed forms for the updates as: 


Theorem 1. 

x’;+^ := {AjAj + {pDj + 1)/)“' {Ajvj + 


;= 9j + p - z'"+^) 

\meNj J 


(IV-.34) 

(IV-.35) 

(IV-.36) 

(IV-.37) 



V. Results 

The model described in section equation was 

simulated, with a wideband signal of 201 channels and a 
network of 50 nodes (i.e. the signal will be sampled at a 
1/4 of rate predicted by Nyquist theory). The mixing patterns 
were generated from iid Gaussian sources (i.e the matrix S 
had each entry drawn from an iid Gaussian source). Monte 
Carlo simulations were performed at SNR values ranging from 
5 to 20, and the expected Mean Squared Error (MSB) of 
solutions of a centralised solver (spgll) and a distributed solver 
(ADMM) were calculated over 10 simulations per SNR value. 
The results can be seen in fig ( |V.4| ). 

The MSB was calculated as follows: 



Figure V.3. Mse vs SNR for the sensing model, with AWGN only, showing 
the performance of distributed and centralised solvers 

VI. Conclusions 

We have demonstrated an alternating direction algorithm for 
distributed optimisation with closed forms for the computation 
at each step, and discussed the statistical properties of the 
estimation. 

We have simulated the performance of this distributed 
algorithm for the distributed estimation of frequency spectra, 
in the presence of additive (white, Gaussian) and multiplicative 
(frequency fiat) noise. We have shown that the algorithm 
is robust to a variety of SNRs and converges to the same 
solution as an equivalent centralised algorithm (in relative 
mean-squared-error). 
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Figure V.4. Mse vs SNR for the sensing model, showing the performance of 
distributed and centralised solvers 


error iterations 



Figure V.5. The progress of the distributed solver as a function of the number 
of iterations, with different values of the regression parameter A 

We plan to work on larger, more detailed, models for 
the frequency spectra and to accelerate the convergence via 
Nesterov type methods to smooth the convergence of the 
distributed algorithm (61 . Specifically, we seek to dampen the 
ringing seen in Figure [V^ 



Figure V.6. The progress of a distributed (blue) and a centralised (green) 
solver as a function of the number of iterations. The value of A = 0.1 
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